AITopics | natural product

Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra

Neural Information Processing Systems, Jun-21-2026, 10:57:25 GMT

Nuclear Magnetic Resonance (NMR) spectroscopy is a cornerstone technique for determining the structures of small molecules and is especially critical in the discovery of novel natural products and clinical therapeutics. Yet, interpreting NMR spectra remains a time-consuming, manual process requiring extensive domain expertise. We introduce ChefNMR (CHemical Elucidation From NMR), an end-to-end framework that directly predicts an unknown molecule's structure solely from its 1D NMR spectra and chemical formula. We frame structure elucidation as conditional generation from an atomic diffusion model built on a non-equivariant transformer architecture. To model the complex chemical groups found in natural products, we generated a dataset of simulated 1D NMR spectra for over 111,000 natural products. ChefNMR predicts the structures of challenging natural product compounds with an unsurpassed accuracy of over 65%. This work takes a significant step toward solving the grand challenge of automating small-molecule structure elucidation and highlights the potential of deep learning in accelerating molecular discovery.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra

Neural Information Processing Systems, Jun-13-2026, 17:57:12 GMT

Nuclear Magnetic Resonance (NMR) spectroscopy is a cornerstone technique for determining the structures of small molecules and is especially critical in the discovery of novel natural products and clinical therapeutics. Yet, interpreting NMR spectra remains a time-consuming, manual process requiring extensive domain expertise. We introduce ChefNMR (CHemical Elucidation From NMR), an end-to-end framework that directly predicts an unknown molecule's structure solely from its 1D NMR spectra and chemical formula. We frame structure elucidation as conditional generation from an atomic diffusion model built on a non-equivariant transformer architecture. To model the complex chemical groups found in natural products, we generated a dataset of simulated 1D NMR spectra for over 111,000 natural products. ChefNMR predicts the structures of challenging natural product compounds with an unsurpassed accuracy of over 65%. This work takes a significant step toward solving the grand challenge of automating small-molecule structure elucidation and highlights the potential of deep learning in accelerating molecular discovery.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra

Xiong, Ziyu, Zhang, Yichi, Alauddin, Foyez, Cheng, Chu Xin, An, Joon Soo, Seyedsayamdost, Mohammad R., Zhong, Ellen D.

arXiv.org Artificial Intelligence, Dec-4-2025

Nuclear Magnetic Resonance (NMR) spectroscopy is a cornerstone technique for determining the structures of small molecules and is especially critical in the discovery of novel natural products and clinical therapeutics. Yet, interpreting NMR spectra remains a time-consuming, manual process requiring extensive domain expertise. We introduce ChefNMR (CHemical Elucidation From NMR), an end-to-end framework that directly predicts an unknown molecule's structure solely from its 1D NMR spectra and chemical formula. We frame structure elucidation as conditional generation from an atomic diffusion model built on a non-equivariant transformer architecture. To model the complex chemical groups found in natural products, we generated a dataset of simulated 1D NMR spectra for over 111,000 natural products. ChefNMR predicts the structures of challenging natural product compounds with an unsurpassed accuracy of over 65%. This work takes a significant step toward solving the grand challenge of automating small-molecule structure elucidation and highlights the potential of deep learning in accelerating molecular discovery. Code is available at https://github.com/ml-struct-bio/chefnmr.
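
The abstract above says the model is conditioned on the chemical formula as well as the spectra. One way to picture that input: the formula fixes how many atoms of each element the generated structure must contain. The following is a minimal sketch of that bookkeeping only, assuming simple formulas without parentheses, charges, or isotopes. The function names `parse_formula` and `atom_list` are hypothetical and are not taken from the ChefNMR code.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C9H8O4'.

    Assumption: no parentheses, charges, or isotope labels.
    """
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def atom_list(formula: str, include_hydrogens: bool = False) -> list[str]:
    """Expand a formula into one element label per atom, sorted by element."""
    counts = parse_formula(formula)
    atoms = []
    for element in sorted(counts):
        if element == "H" and not include_hydrogens:
            continue
        atoms.extend([element] * counts[element])
    return atoms


if __name__ == "__main__":
    print(dict(parse_formula("C9H8O4")))  # element counts for aspirin
    print(atom_list("C9H8O4"))            # heavy-atom labels only
```

A generative model that places atoms in 3D could use such a list to decide how many positions to generate and which element each one carries. How ChefNMR actually encodes the formula is not stated in the abstract.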

artificial intelligence, machine learning, spectra, (17 more...)

arXiv.org Artificial Intelligence

2512.03127

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

NaFM: Pre-training a Foundation Model for Small-Molecule Natural Products

Ding, Yuheng, Wang, Yusong, Qiang, Bo, Yu, Jie, Li, Qi, Zhou, Yiran, Liu, Zhenmin

arXiv.org Artificial Intelligence, Mar-22-2025

Natural products, as metabolites from microorganisms, animals, or plants, exhibit diverse biological activities, making them crucial for drug discovery. Existing deep learning methods for natural products research rely primarily on supervised learning approaches designed for specific downstream tasks. However, such a one-model-for-a-task paradigm often lacks generalizability and leaves significant room for performance improvement. Additionally, existing molecular characterization methods are not well-suited for the unique tasks associated with natural products. To address these limitations, we have pre-trained a foundation model for natural products based on their unique properties. Our approach employs a novel pretraining strategy that is especially tailored to natural products. By incorporating contrastive learning and masked graph learning objectives, we emphasize evolutionary information from molecular scaffolds while capturing side-chain information. Our framework achieves state-of-the-art (SOTA) results in various downstream tasks related to natural product mining and drug discovery. We first compare taxonomy classification with synthesized molecule-focused baselines to demonstrate that current models are inadequate for understanding natural synthesis. Furthermore, through a fine-grained analysis at both the gene and microbial levels, NaFM demonstrates the ability to capture evolutionary information. Finally, we evaluate our method on virtual screening, showing that informative natural product representations can lead to more effective identification of potential drug candidates.

artificial intelligence, machine learning, natural product, (18 more...)

arXiv.org Artificial Intelligence

2503.17656

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Accelerating Antibiotic Discovery with Large Language Models and Knowledge Graphs

Delmas, Maxime, Wysocka, Magdalena, Gusicuma, Danilo, Freitas, André

arXiv.org Artificial Intelligence, Mar-20-2025

The discovery of novel antibiotics is critical to address the growing antimicrobial resistance (AMR). However, pharmaceutical industries face high costs (over $1 billion), long timelines, and a high failure rate, worsened by the rediscovery of known compounds. We propose an LLM-based pipeline that acts as an alarm system, detecting prior evidence of antibiotic activity to prevent costly rediscoveries. The system integrates organism and chemical literature into a Knowledge Graph (KG), ensuring taxonomic resolution, synonym handling, and multi-level evidence classification. We tested the pipeline on a private list of 73 potential antibiotic-producing organisms, disclosing 12 negative hits for evaluation. The results highlight the effectiveness of the pipeline for evidence reviewing, reducing false negatives, and accelerating decision-making. The KG for negative hits and the user interface for interactive exploration will be made publicly available.
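
The "alarm system" idea in the abstract above can be illustrated with a toy lookup: resolve an organism name through its synonyms, then return any prior evidence of antibiotic activity linked to it. This is only a sketch under stated assumptions. The class `EvidenceGraph`, its methods, and the evidence labels "direct" and "indirect" are hypothetical, and the paper's actual Knowledge Graph schema and LLM extraction steps are not described here.

```python
from collections import defaultdict


class EvidenceGraph:
    """Toy organism-to-compound evidence store with synonym handling (hypothetical)."""

    def __init__(self):
        # Lowercased name or synonym -> canonical organism name.
        self._canonical = {}
        # Canonical organism name -> list of (compound, evidence_level).
        self._edges = defaultdict(list)

    def add_organism(self, canonical, synonyms=()):
        for name in (canonical, *synonyms):
            self._canonical[name.strip().lower()] = canonical

    def add_evidence(self, organism, compound, level):
        canonical = self._canonical.get(organism.strip().lower())
        if canonical is None:
            raise KeyError(f"unknown organism: {organism!r}")
        self._edges[canonical].append((compound, level))

    def prior_evidence(self, name):
        """Return known (compound, level) pairs, or [] if nothing is recorded."""
        canonical = self._canonical.get(name.strip().lower())
        if canonical is None:
            return []
        return list(self._edges.get(canonical, []))


if __name__ == "__main__":
    kg = EvidenceGraph()
    kg.add_organism("Streptomyces griseus", synonyms=["S. griseus"])
    kg.add_evidence("Streptomyces griseus", "streptomycin", "direct")

    # A query by synonym raises the alarm; an unseen organism returns nothing.
    print(kg.prior_evidence("s. griseus"))
    print(kg.prior_evidence("Unknown sp."))
```

An empty result for an organism is the interesting case in this framing: it suggests no recorded prior evidence, so the candidate is less likely to be a rediscovery.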

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.16655

Country:

Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Switzerland (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
(2 more...)

Genre:

Research Report (0.50)
Overview (0.47)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

FragFM: Efficient Fragment-Based Molecular Generation via Discrete Flow Matching

Lee, Joongwon, Kim, Seonghwan, Kim, Wou Youn

arXiv.org Artificial Intelligence, Feb-19-2025

Abstract

We introduce FragFM, a novel fragment-based discrete flow matching framework for molecular graph generation. FragFM generates molecules at the fragment level, leveraging a coarse-to-fine autoencoding mechanism to reconstruct atom-level details. This approach reduces computational complexity while maintaining high chemical validity, enabling more efficient and scalable molecular generation. Notably, FragFM achieves over 99% validity with significantly fewer sampling steps, improving scalability while preserving molecular diversity. These results highlight the potential of fragment-based generative modeling for large-scale, property-aware molecular design, paving the way for more efficient exploration of chemical space.

1 Introduction

Deep generative models, such as diffusion and flow matching, have demonstrated remarkable success across domains like images (Nichol et al., 2021; Rombach et al., 2022; Ho et al., 2020), text (Li et al., 2022), and videos (Hu & Xu, 2023; Ho et al., 2022). Recently, their application to molecular graph generation has gained attention, where they aim to generate chemically valid molecules by leveraging the structural properties of molecular graphs (Jo et al., 2022; Vignac et al., 2022; Qin et al., 2024). However, existing atom-based generative models face scalability challenges, particularly in generating large and complex molecules. The quadratic growth of edges as graph size increases results in computational inefficiencies. At the same time, the inherent sparsity of chemical bonds makes accurate edge prediction more complex, often leading to unrealistic molecular structures or invalid connectivity constraints (Qin et al., 2023; Chen et al., 2023). Additionally, graph neural networks (GNNs) struggle to capture topological features such as rings and loops, leading to deviations from chemically valid structures. While various methods incorporate auxiliary features (e.g., spectral, ring, and valency information) to mitigate these issues, they do not fully resolve the sparsity and scalability bottlenecks (Vignac et al., 2022).
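
The scalability and sparsity points in the FragFM excerpt above can be made concrete with a little arithmetic. An atom-level model that predicts a bond type for every atom pair must consider n(n-1)/2 candidate edges, while a molecular graph with n atoms and r rings has only n - 1 + r bonds. The sketch below uses hypothetical helper names and an assumed ring count, and is not taken from the FragFM paper.

```python
def candidate_pairs(n_atoms: int) -> int:
    """Number of unordered atom pairs a dense edge predictor must score."""
    return n_atoms * (n_atoms - 1) // 2


def bond_count(n_atoms: int, n_rings: int = 0) -> int:
    """Bonds in a connected molecular graph: a spanning tree plus one bond per ring."""
    return n_atoms - 1 + n_rings


def bond_density(n_atoms: int, n_rings: int = 0) -> float:
    """Fraction of candidate pairs that are actual bonds."""
    return bond_count(n_atoms, n_rings) / candidate_pairs(n_atoms)


if __name__ == "__main__":
    # Assumption for illustration: three rings regardless of molecule size.
    for n in (10, 50, 100):
        print(n, candidate_pairs(n), bond_count(n, 3), round(bond_density(n, 3), 4))
```

The candidate pairs grow quadratically while the bonds grow linearly, so the fraction of positive edges shrinks as molecules get larger. Generating at the fragment level reduces the number of nodes, which is one way to read the motivation given in the excerpt.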

arxiv preprint arxiv, fragment, molecule, (15 more...)

arXiv.org Artificial Intelligence

2502.15805

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

NPGPT: Natural Product-Like Compound Generation with GPT-based Chemical Language Models

Sakano, Koh, Furui, Kairi, Ohue, Masahito

arXiv.org Machine Learning, Nov-19-2024

Natural products are substances produced by organisms in nature and often possess biological activity and structural diversity. Drug development based on natural products has been common for many years. However, the intricate structures of these compounds present challenges in terms of structure determination and synthesis, particularly compared to the efficiency of high-throughput screening of synthetic compounds. In recent years, deep learning-based methods have been applied to the generation of molecules. In this study, we trained chemical language models on a natural product dataset and generated natural product-like compounds. The results showed that the distribution of the compounds generated was similar to that of natural products. We also evaluated the effectiveness of the generated compounds as drug candidates. Our method can be used to explore the vast chemical space and reduce the time and cost of drug discovery of natural products.

compound, molecule, natural product, (13 more...)

arXiv.org Machine Learning

2411.12886

Country:

Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > New Finding (0.75)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design

Adams, Keir, Abeywardane, Kento, Fromer, Jenna, Coley, Connor W.

arXiv.org Artificial Intelligence, Oct-22-2024

Engineering molecules to exhibit precise 3D intermolecular interactions with their environment forms the basis of chemical design. In ligand-based drug design, bioisosteric analogues of known bioactive hits are often identified by virtually screening chemical libraries with shape, electrostatic, and pharmacophore similarity scoring functions. We instead hypothesize that a generative model which learns the joint distribution over 3D molecular structures and their interaction profiles may facilitate 3D interaction-aware chemical design. We specifically design ShEPhERD, an SE(3)-equivariant diffusion model which jointly diffuses/denoises 3D molecular graphs and representations of their shapes, electrostatic potential surfaces, and (directional) pharmacophores to/from Gaussian noise. Inspired by traditional ligand discovery, we compose 3D similarity scoring functions to assess ShEPhERD's ability to conditionally generate novel molecules with desired interaction profiles. We demonstrate ShEPhERD's potential for impact via exemplary drug design tasks including natural product ligand hopping, protein-blind bioactive hit diversification, and bioisosteric fragment merging.

artificial intelligence, machine learning, molecule, (19 more...)

arXiv.org Artificial Intelligence

2411.0413

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

How A.I. Teaches Machines to Discover Drugs

The New Yorker, Sep-2-2024, 10:00:00 GMT

When I first became a doctor, I cared for an older man whom I'll call Ted. He was so sick with pneumonia that he was struggling to breathe. His primary-care physician had prescribed one antibiotic after another, but his symptoms had only worsened; by the time I saw him in the hospital, he had a high fever and was coughing up blood. His lungs seemed to be infected with methicillin-resistant Staphylococcus aureus (MRSA), a bacterium so hardy that few drugs can kill it. I placed an oxygen tube in his nostrils, and one of my colleagues inserted an I.V. into his arm. We decided to give him vancomycin, a last line of defense against otherwise untreatable infections.

artificial intelligence, machine learning, medicine, (19 more...)

The New Yorker

Country:

Europe > Jersey (0.15)
North America > United States > New Jersey (0.05)
South America > Argentina (0.04)
(6 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Domain-Agnostic Molecular Generation with Self-feedback

Fang, Yin, Zhang, Ningyu, Chen, Zhuo, Guo, Lingbing, Fan, Xiaohui, Chen, Huajun

arXiv.org Artificial Intelligence, Oct-2-2023

The generation of molecules with desired properties has gained tremendous popularity, revolutionizing the way scientists design molecular structures and providing valuable support for chemical and drug design. However, despite the potential of language models in molecule generation, they face numerous challenges such as the generation of syntactically or chemically flawed molecules, narrow domain focus, and limitations in creating diverse and directionally feasible molecules due to a dearth of annotated data or external molecular databases. To tackle these challenges, we introduce MolGen, a pre-trained molecular language model tailored specifically for molecule generation. Through the reconstruction of over 100 million molecular SELFIES, MolGen internalizes profound structural and grammatical insights. This is further enhanced by domain-agnostic molecular prefix tuning, fostering robust knowledge transfer across diverse domains. Importantly, our self-feedback paradigm steers the model away from "molecular hallucinations", ensuring alignment between the model's estimated probabilities and real-world chemical preferences. Extensive experiments on well-known benchmarks underscore MolGen's optimization capabilities in properties such as penalized logP, QED, and molecular docking. Additional analyses affirm its proficiency in accurately capturing molecule distributions, discerning intricate structural patterns, and efficiently exploring the chemical space. Code is available at https://github.com/zjunlp/MolGen.

dataset, language model, molecule, (16 more...)

arXiv.org Artificial Intelligence

2301.11259

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
